Dynamic Lexicon Generation for Natural Scene Images
Abstract
Many scene text understanding methods approach the end-to-end recognition problem from a word-spotting perspective and benefit greatly from small per-image lexicons. Such customized lexicons are normally assumed to be given, and their source is rarely discussed. In this paper we propose a method that generates contextualized lexicons for scene images using only visual information. To do so, we exploit the correlation between visual and textual information in a dataset of images and their associated textual content. Using the topic modeling framework to discover a set of latent topics in such a dataset allows us to re-rank a fixed dictionary so that the words more likely to appear in a given image are prioritized. Moreover, we train a CNN that reproduces those word rankings using only the raw image pixels as input. We demonstrate that the quality of the automatically obtained custom lexicons is superior to that of a generic frequency-based baseline.
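The re-ranking step described in the abstract can be illustrated with a minimal sketch. It assumes an LDA-style topic model in which each word is scored by marginalizing over topics, score(w) = sum_k P(w | topic k) * P(topic k | image). The abstract does not state this formula, so it is an assumption, as are the function name `rank_lexicon`, the toy dictionary, and all probabilities below. The per-image topic distribution is taken as given here, however it is obtained.

```python
import numpy as np

def rank_lexicon(topic_probs, word_given_topic, dictionary, top_n=None):
    """Re-rank a fixed dictionary for one image (illustrative only).

    topic_probs:      shape (K,), assumed P(topic | image)
    word_given_topic: shape (K, V), assumed P(word | topic) from a topic model
    dictionary:       list of V words, aligned with the columns above
    """
    topic_probs = np.asarray(topic_probs, dtype=float)
    word_given_topic = np.asarray(word_given_topic, dtype=float)
    # Marginalize over topics to get one score per dictionary word.
    scores = topic_probs @ word_given_topic
    # Sort by descending score; a stable sort keeps dictionary order on ties.
    order = np.argsort(-scores, kind="stable")
    ranked = [dictionary[i] for i in order]
    return ranked if top_n is None else ranked[:top_n]

# Hypothetical toy data: 2 topics, 4 dictionary words.
dictionary = ["menu", "stop", "pizza", "exit"]
word_given_topic = np.array([
    [0.50, 0.05, 0.40, 0.05],  # made-up "food" topic
    [0.05, 0.50, 0.05, 0.40],  # made-up "street" topic
])
topic_probs = np.array([0.9, 0.1])  # image assumed to be mostly "food"

lexicon = rank_lexicon(topic_probs, word_given_topic, dictionary, top_n=2)
full_ranking = rank_lexicon(topic_probs, word_given_topic, dictionary)
```

With these toy numbers the two-word lexicon is `["menu", "pizza"]`. A generic frequency-based baseline would instead return the same ordering for every image, regardless of its content.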
Related papers
Text Recognition and Retrieval in Natural Scene Images
In the past few years, text in natural scene images has gained potential as a key feature for content-based retrieval. Such text can be extracted and used in search engines, providing relevant information about the images. Robust and efficient techniques from the document analysis and vision communities were borrowed to solve the challenge of digitizing text in such images in the wild. In this ...
Natural scene text localization using edge color signature
Localizing text regions in images taken from natural scenes is a challenging problem due to variations in the font, size, color and orientation of text. In this paper, we introduce a new concept, the so-called Edge Color Signature, for localizing text regions in an image. This method is able to localize both Farsi and English texts. In the proposed method first a pyramid using diff...
Depth Structure Preserving Scene Image Generation
The key to automatically generating natural scene images is to properly arrange the various spatial elements, especially in the depth direction. To this end, we introduce a novel depth structure preserving scene image generation network (DSP-GAN), which favors a hierarchical and heterogeneous architecture, for the purpose of depth structure preserving scene generation. The main trunk of the propose...
Deep Learning of Hierarchical Structure
Hierarchical and recursive structure is commonly found in inputs from the richest sensory modalities, including natural language sentences and scene images. But such hierarchical structure has traditionally been a strong point of both structured and supervised models (whether symbolic or probabilistic) and a weak point of both neural networks and unsupervised learning. I will present some of ou...
Intermediate View Generation of Soccer Scene from Multiple Videos
This paper introduces a novel method for generating an intermediate view of a soccer scene captured by multiple video cameras. In the proposed method, the soccer scene is classified into dynamic regions, a field region, and a background region. Using epipolar geometry in the first region and homography in the second, dense correspondence is obtained to interpolate views. For the third region, partial ar...